AutoClass: A Bayesian Classification System

Authors

  • Peter C. Cheeseman
  • James Kelly
  • Matthew Self
  • John C. Stutz
  • Will Taylor
  • Don Freeman
Abstract

This paper describes AutoClass II, a program for automatically discovering (inducing) classes from a database, based on a Bayesian statistical technique which automatically determines the most probable number of classes, their probabilistic descriptions, and the probability that each object is a member of each class. AutoClass has been tested on several large, real databases and has discovered previously unsuspected classes. There is evidence that these classes represent new phenomena.
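The abstract's "probability that each object is a member of each class" can be illustrated with a minimal sketch. It assumes a one-dimensional mixture of Gaussian classes whose weights, means, and standard deviations are already known; the function names and the numbers are hypothetical and are not taken from the paper. It shows only the membership step via Bayes' rule, not how AutoClass determines the number of classes or their parameters.

```python
import math


def gaussian_pdf(x, mean, std):
    """Density of a normal distribution with the given mean and std at x."""
    z = (x - mean) / std
    return math.exp(-0.5 * z * z) / (std * math.sqrt(2.0 * math.pi))


def membership_probabilities(x, classes):
    """Return P(class | x) for every class, using Bayes' rule.

    `classes` is a list of (weight, mean, std) tuples (an assumed
    representation); the weights are prior class probabilities summing to 1.
    """
    # Joint probability of "object belongs to class k" and "object has value x".
    joint = [w * gaussian_pdf(x, m, s) for (w, m, s) in classes]
    total = sum(joint)
    # Normalizing the joint terms gives the posterior membership probabilities.
    return [j / total for j in joint]


# Hypothetical two-class model (illustrative numbers only).
classes = [(0.5, 0.0, 1.0), (0.5, 4.0, 1.0)]
probs = membership_probabilities(2.0, classes)
```

An object lying exactly halfway between two equally weighted, equally wide classes is assigned to each with equal probability; objects are never forced into a single class, which is the sense in which the class descriptions are probabilistic.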

Similar resources

AutoClass@IJM: a powerful tool for Bayesian classification of heterogeneous data in biology

Recently, several theoretical and applied studies have shown that unsupervised Bayesian classification systems are of particular relevance for biological studies. However, these systems have not yet fully reached the biological community mainly because there are few freely available dedicated computer programs, and Bayesian clustering algorithms are known to be time consuming, which limits thei...

Using Bayesian Classification for Aq-based Learning with Constructive Induction

To obtain potentially interesting patterns and relations from large, distributed, heterogeneous databases, it is essential to employ an intelligent and automated KDD (Knowledge Discovery in Databases) process. One of the most important methodologies is an integration of diverse learning strategies that cooperatively performs a variety of techniques and achieves high quality knowledge. AqBC is a...

Bayesian Classification (AutoClass): Theory and Results

We describe AutoClass, an approach to unsupervised classification based upon the classical mixture model, supplemented by a Bayesian method for determining the optimal classes. We include a moderately detailed exposition of the mathematics behind the AutoClass system. We emphasize that no current unsupervised classification system can produce maximally useful results when operated alone. It is th...

Scalable Parallel Clustering for Data Mining on Multicomputers

This paper describes the design and implementation on MIMD parallel machines of P-AutoClass, a parallel version of the AutoClass system based upon the Bayesian method for determining optimal classes in large datasets. The P-AutoClass implementation divides the clustering task among the processors of a multicomputer so that they work on their own partition and exchange their intermediate results...

AqBC: A Multistrategy Approach for Constructive Induction

In order to obtain potentially interesting patterns and relations from large, distributed, heterogeneous databases, it is essential to employ an intelligent and automated KDD (Knowledge Discovery in Databases) process. One of the most important methodologies is an integration of diverse learning strategies that cooperatively performs a variety of techniques and achieves high quality knowledge. ...

Publication date: 1988